The Rating Game: Sentiment Rating Reproducibility from Text
Abstract
Sentiment analysis models often use ratings as labels, assuming that these ratings reflect the sentiment of the accompanying text. We investigate (i) whether human readers can infer ratings from review text, (ii) how human performance compares to a regression model, and (iii) whether model performance is affected by the rating “source” (i.e., original author vs. annotator). We collect IMDb movie reviews with author-provided ratings and have them re-annotated by crowdsourced and trained annotators. Annotators reproduce the original ratings better than a model, but are still far off in more than 5% of the cases. Models trained on annotator labels outperform those trained on author labels, which calls into question the usefulness of author-rated reviews as training data for sentiment analysis.
Similar Resources
Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating
E-commerce is developing rapidly. Learning from the myriad reviews left by online customers, and taking good advantage of them, has become crucial to success in this game, which calls for increasingly accurate sentiment classification of these reviews. Finer-grained review rating prediction is therefore preferred over rough binary sentiment classification. There are mainly two types of method i...
An Approach to Sentiment Analysis - the Case of Airline Quality Rating
Sentiment mining has been commonly associated with the analysis of a text string to determine whether a corpus is of a negative or positive opinion. Recently, sentiment mining has been extended to address problems such as distinguishing objective from subjective propositions, and determining the sources and topics of different opinions expressed in textual data sets such as web blogs, tweets, m...
Restaurants Review Star Prediction for Yelp Dataset
Yelp connects people to great local businesses. In this paper, we focus on restaurant reviews. We aim to predict the rating for a restaurant from previous information, such as the review text, the user’s review history, and the restaurant’s statistics. We investigate the data set provided by Yelp Dataset Challenge round 5. In this project, we will predict the star (rating) of a ...
Aspect and Ratings Inference with Aspect Ratings: Supervised Generative Models for Mining Hotel Reviews
Today, a large volume of hotel reviews is available on many websites, such as TripAdvisor (http://www.tripadvisor.com) and Orbitz (http://www.orbitz.com). A typical review contains an overall rating and several aspect ratings along with text. The rating is perceived as an abstraction of reviewers’ satisfaction in terms of points. Although the number of reviews with aspect ratings is growing, ...
A Text Polarity Analysis Using Sentiwordnet Based an Algorithm
Sentiment analysis aims to get at the underlying viewpoint of a text, which could be an opinion, an online review, a movie rating comment, etc. The aim of this project is to offer a better sentiment analysis strategy that recognizes the polarity of a text message (positive, negative, or neutral) using SentiWordNet. The contribution of this paper is the use of a POS (part-of-speech) tagger to examine specific...